AITopics | Bishkek

Bishkek

Syntactic Transfer to Kyrgyz Using the Treebank Translation Method

Alekseev, Anton, Tillabaeva, Alina, Kabaeva, Gulnara Dzh., Nikolenko, Sergey I.

arXiv.org Artificial Intelligence, Dec-17-2024

The Kyrgyz language, as a low-resource language, requires significant effort to create high-quality syntactic corpora. This study proposes an approach to simplify the development process of a syntactic corpus for Kyrgyz. We present a tool for transferring syntactic annotations from Turkish to Kyrgyz based on a treebank translation method. The effectiveness of the proposed tool was evaluated using the TueCL treebank. The results demonstrate that this approach achieves higher syntactic annotation accuracy compared to a monolingual model trained on the Kyrgyz KTMU treebank. Additionally, the study introduces a method for assessing the complexity of manual annotation for the resulting syntactic trees, contributing to further optimization of the annotation process.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.13146

Country:

Asia > Russia (0.05)
Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
Asia > Kyrgyzstan > Chüy Region > Bishkek (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings

Alekseev, Anton, Kabaeva, Gulnara

arXiv.org Artificial Intelligence, Nov-28-2024

One of the key tasks in modern applied computational linguistics is constructing word vector representations (word embeddings), which are widely used to address natural language processing tasks such as sentiment analysis, information extraction, and more. To choose an appropriate method for generating these word embeddings, quality assessment techniques are often necessary. A standard approach involves calculating distances between vectors for words with expert-assessed 'similarity'. This work introduces the first 'silver standard' dataset for such tasks in the Kyrgyz language, alongside training corresponding models and validating the dataset's suitability through quality evaluation metrics.
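The evaluation idea in this abstract (comparing vector distances against expert-assessed similarity) can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the choice of cosine similarity and Spearman rank correlation is an assumption (both are common for this kind of benchmark), and the function names, word tokens, vectors, and scores are all hypothetical placeholders.

```python
import math

def cosine(u, v):
    # Cosine similarity; assumes both vectors are non-zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def ranks(values):
    # 1-based ranks, with tied values sharing their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))

def evaluate(embeddings, scored_pairs):
    # Correlate model similarities with human scores, skipping pairs
    # that contain a word missing from the embedding table.
    model, human = [], []
    for w1, w2, score in scored_pairs:
        if w1 in embeddings and w2 in embeddings:
            model.append(cosine(embeddings[w1], embeddings[w2]))
            human.append(score)
    return spearman(model, human)

# Hypothetical toy data: placeholder tokens, not real Kyrgyz words or scores.
toy_embeddings = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
    "w4": [-1.0, 0.0],
}
toy_pairs = [("w1", "w2", 9.0), ("w1", "w3", 4.0), ("w1", "w4", 1.0)]

print(evaluate(toy_embeddings, toy_pairs))
```

A higher correlation suggests that the embedding space orders word pairs more like the human annotators do; rank correlation is used here because only the ordering of the scores, not their scale, is compared.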

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.56634/16948335.2023.4.1723-1731

2411.10724

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saxony > Leipzig (0.09)
Asia > Russia (0.05)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

KyrgyzNLP: Challenges, Progress, and Future

Alekseev, Anton, Turatali, Timur

arXiv.org Artificial Intelligence, Nov-15-2024

Large language models (LLMs) have excelled in numerous benchmarks, advancing AI applications in both linguistic and non-linguistic tasks. However, this has primarily benefited well-resourced languages, leaving less-resourced ones (LRLs) at a disadvantage. In this paper, we highlight the current state of the NLP field for one specific LRL: Kyrgyz (kyrgyz tili). Human evaluation, including annotated datasets created by native speakers, remains an irreplaceable component of reliable NLP performance, especially for LRLs where automatic evaluations can fall short. In recent assessments of the resources for Turkic languages, Kyrgyz is labeled with the status 'Scraping By', a severely under-resourced language spoken by millions. This is concerning given the growing importance of the language, not only in Kyrgyzstan but also among diaspora communities where it holds no official status. We review prior efforts in the field, noting that many of the publicly available resources have only recently been developed, with few exceptions beyond dictionaries (the processed data used for the analysis is presented at https://kyrgyznlp.github.io/). While recent papers have made some headway, much more remains to be done. Despite interest and support from both business and government sectors in the Kyrgyz Republic, the situation for Kyrgyz language resources remains challenging. We stress the importance of community-driven efforts to build these resources, ensuring that future advancement is sustainable. We then share our view of the most pressing challenges in Kyrgyz NLP. Finally, we propose a roadmap for future development in terms of research topics and language resources.

kyrgyz nlp, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2411.05503

Country:

Asia > Russia (0.14)
Europe > Germany > Saxony > Leipzig (0.05)
Asia > Kyrgyzstan > Chüy Region > Bishkek (0.04)
(19 more...)

Genre:

Research Report (1.00)
Overview > Growing Problem (0.34)

Industry:

Government (1.00)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge

Dong, Beidi, Lee, Jin R., Zhu, Ziwei, Srinivasan, Balassubramanian

arXiv.org Artificial Intelligence, Aug-29-2024

The United States has experienced a significant increase in violent extremism, prompting the need for automated tools to detect and limit the spread of extremist ideology online. This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing "far-right" and "far-left" ideological keywords and manually labeled them as extremist or non-extremist. Extremist posts were further classified into one or more of five contributing elements of extremism based on a working definitional framework. The BERT model's performance was evaluated based on training data size and knowledge transfer between categories. We also compared the performance of GPT 3.5 and GPT 4 models using different prompts: naïve, layperson-definition, role-playing, and professional-definition. Results showed that the best performing GPT models outperformed the best performing BERT models, with more detailed prompts generally yielding better results. However, overly complex prompts may impair performance. Different versions of GPT have unique sensitivities to what they consider extremist. GPT 3.5 performed better at classifying far-left extremist posts, while GPT 4 performed better at classifying far-right extremist posts. Large language models, represented by GPT models, hold significant potential for online extremism classification tasks, surpassing traditional BERT models in a zero-shot setting. Future research should explore human-computer interactions in optimizing GPT models for extremist detection and classification tasks to develop more efficient (e.g., quicker, less effort) and effective (e.g., fewer errors or mistakes) methods for identifying extremist content.

classification task, extremism, extremist post, (16 more...)

arXiv.org Artificial Intelligence

2408.16749

Country:

Europe > Germany (0.14)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Terrorism (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Knowledge Graph Representation for Political Information Sources

Osmonova, Tinatin, Tikhonov, Alexey, Yamshchikov, Ivan P.

arXiv.org Artificial Intelligence, Apr-4-2024

With the rise of computational social science, many scholars utilize data analysis and natural language processing tools to analyze social media, news articles, and other accessible data sources for examining political and social discourse. In particular, the study of the emergence of echo-chambers due to the dissemination of specific information has become a topic of interest in mixed methods research areas. In this paper, we analyze data collected from two news portals, Breitbart News (BN) and New York Times (NYT), to prove the hypothesis that the formation of echo-chambers can be partially explained at the level of individual information consumption rather than by the collective topology of individuals' social networks. Our research findings are presented through knowledge graphs, utilizing a dataset spanning 11.5 years gathered from BN and NYT media portals. We demonstrate that the application of knowledge representation techniques to the aforementioned news streams shows, contrary to common assumptions, relative "internal" neutrality of both sources and a polarizing attitude towards a small fraction of entities. Additionally, we argue that such characteristics in information sources lead to fundamental disparities in audience worldviews, potentially acting as a catalyst for the formation of echo-chambers.

graph, knowledge graph, subjectivity, (14 more...)

arXiv.org Artificial Intelligence

2404.03437

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Government (1.00)
Media > News (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.72)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

HIVA: Holographic Intellectual Voice Assistant

Isaev, Ruslan, Gumerov, Radmir, Esenalieva, Gulzada, Mekuria, Remudin Reshid, Doszhanov, Ermek

arXiv.org Artificial Intelligence, Jun-27-2023

Holographic Intellectual Voice Assistant (HIVA) aims to facilitate human-computer interaction using audiovisual effects and a 3D avatar. HIVA provides complete information about the university and handles requests of various kinds: admission, study issues, fees, departments, university structure and history, canteen, human resources, library, student life and events, information about the country and the city, etc. There are other ways of receiving the data listed above: the university's official website and other supporting apps, HEI (Higher Education Institution) official social media, directly asking the HEI staff, and other channels. However, HIVA provides the unique experience of "face-to-face" interaction with an animated 3D mascot, helping to get a sense of 'real-life' communication. The system includes many sub-modules and connects a family of applications such as mobile applications, Telegram chatbot, suggestion categorization, and entertainment services. The voice assistant uses Russian language NLP models and tools, which are pipelined for the best user experience.

data mining, information, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICECCO58239.2023.10146600

2307.05501

Country:

Asia > Kyrgyzstan > Chüy Region > Bishkek (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Italy (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Higher Education (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(4 more...)

Applying Machine Learning Analysis for Software Quality Test

Khan, Al, Mekuria, Remudin Reshid, Isaev, Ruslan

arXiv.org Artificial Intelligence, May-16-2023

One of the biggest expenses in software development is maintenance. Therefore, it is critical to understand what triggers maintenance and whether it can be predicted. Numerous studies have demonstrated that specific methods of assessing the complexity of created programs may produce useful prediction models to ascertain the likelihood of maintenance due to software failures. As a routine, this is performed prior to release, and setting up the models frequently calls for certain object-oriented software measurements. Software developers do not always have access to these measurements. In this paper, machine learning is applied to the available data to calculate the cumulative software failure levels. A technique to forecast a software's residual defectiveness using machine learning can be looked into as a solution to the challenge of predicting residual flaws. Software metrics and defect data were extracted from the static source code repository. Static code is used to create software metrics, and reported bugs in the repository are used to gather defect information. By using a correlation method, metrics that had no connection to the defect data were removed. This makes it possible to analyze all the data without pausing the programming process. The primary issue with large, sophisticated software is that it is impossible to control everything manually, and the cost of an error can be quite high. Developers may miss errors during testing as a consequence, which will raise maintenance costs. Finding a method to accurately forecast software defects is the overall objective.
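The correlation-based filtering step mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which correlation measure or cutoff was used, so Pearson correlation and the 0.3 threshold below are assumptions, and the metric names and values are hypothetical toy data rather than results from the paper.

```python
import math

def pearson(x, y):
    # Pearson correlation; a constant column is treated as uncorrelated.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_metrics(metrics, defects, threshold):
    # Keep metrics whose absolute correlation with the defect counts
    # reaches the (assumed) threshold; drop the rest.
    return [
        name
        for name, values in metrics.items()
        if abs(pearson(values, defects)) >= threshold
    ]

# Hypothetical per-module measurements and defect counts.
toy_metrics = {
    "lines_of_code": [10, 20, 30, 40],
    "noise": [1, -1, -1, 1],
    "constant": [5, 5, 5, 5],
}
toy_defects = [1, 2, 3, 4]

print(select_metrics(toy_metrics, toy_defects, threshold=0.3))
```

The absolute value is used so that strongly negative relationships are kept as well; whether that matches the paper's filtering rule is not stated in the abstract.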

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICCQ57276.2023.10114664

2305.09695

Country: Asia > Kyrgyzstan > Chüy Region > Bishkek (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.48)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(2 more...)

Electromyography Signal Classification Using Deep Learning

Gaso, Mekia Shigute, Cankurt, Selcuk, Subasi, Abdulhamit

arXiv.org Artificial Intelligence, May-6-2023

We have implemented a deep learning model with L2 regularization and trained it on Electromyography (EMG) data. The data comprises EMG signals collected from a control group and from myopathy and ALS patients. Our proposed deep neural network consists of eight layers: five fully connected, two batch normalization, and one dropout layer. The data is divided into training and testing sections, and the training data is subsequently divided into sub-training and validation sections. With this model, an accuracy of 99 percent is achieved on the test data set. The model was able to distinguish the normal cases (control group) from the others at a precision of 100 percent and to classify myopathy and ALS with high accuracies of 97.4 and 98.2 percent, respectively. Thus, we believe that these highly improved classification accuracies will be beneficial for the clinical diagnosis of neuromuscular disorders.

artificial intelligence, emg signal, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICECCO53203.2021.9663803

2305.04006

Country:

Europe > Finland > Southwest Finland > Turku (0.04)
Asia > Kyrgyzstan > Chüy Region > Bishkek (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Morphological Classification of Galaxies Using SpinalNet

Shaiakhmetov, Dim, Mekuria, Remudin Reshid, Isaev, Ruslan, Unsal, Fatma

arXiv.org Artificial Intelligence, May-2-2023

Deep neural networks (DNNs) with a step-by-step introduction of inputs, constructed by imitating the somatosensory system in the human body and known as SpinalNet, have been implemented in this work on a Galaxy Zoo dataset. The input segmentation in SpinalNet enables the intermediate layers to take some of the inputs as well as the output of preceding layers, thereby reducing the number of weights in the intermediate layers. As a result, the authors of SpinalNet reported achieving, in most of the DNNs they tested, not only a remarkable cut in error but also a large reduction in computational cost. Having applied it to the Galaxy Zoo dataset, we are able to classify the different classes and/or sub-classes of galaxies. Thus, we have obtained higher classification accuracies of 98.2, 95 and 82 percent between ellipticals and spirals, between these two and irregulars, and between 10 sub-classes of galaxies, respectively.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICECCO53203.2021.9663784

2305.01873

Country:

Asia > Kyrgyzstan > Chüy Region > Bishkek (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Aragón (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Coronavirus: The strangers reaching out to Kyrgyzstan's lonely teenagers

BBC News, May-26-2020, 10:12:16 GMT

Like teenagers around the world, Maksat hasn't been to school in weeks. As Kyrgyzstan imposed quarantine restrictions, the 15-year-old feels isolated like never before. He has been trapped at home with a sister he doesn't get on with, a father he struggles to communicate with and a mother working abroad. He is comfortable talking only to an internet chat bot. Maksat (not his real name) feels alone and misunderstood.

artificial intelligence, kyrgyzstan, teenager, (15 more...)

BBC News

Country:

Europe > Hungary > Budapest > Budapest (0.05)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
Asia > Kyrgyzstan > Chüy Region > Bishkek (0.05)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.43)
Health & Medicine > Therapeutic Area > Immunology (0.43)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.32)

Technology: Information Technology > Artificial Intelligence (0.35)
